Optimizing a CFD Fortran code for GRID Computing
Abstract
Computations on clusters and computational GRIDS encounter similar situations in which the processors used have different speeds and different amounts of local RAM. Efficient computation on such processors requires load balancing: faster processors are given more work, or larger domains to compute, than slower processors, so that all processors finish at the same time and faster processors do not wait for slower ones. In addition, the programming language must permit dynamic memory allocation so that the executable size is proportional to the size of the partitions. The present version of the AERO code uses the F77 programming language, which does not have dynamic memory allocation; the size of the executable is therefore the same for all processors, which leads to situations where the RAM of some processors is too small to run the executable. In this report, we extend the parallel F77 AERO code to F90, which has dynamic memory allocation. The F90 version of the AERO code is mesh independent, and because memory is allocated at runtime and only for the code options actually used, the F90 executable is much smaller than the F77 version. As a consequence, many test cases that cannot be run on clusters and computational GRIDS with the F77 version can easily be run with the F90 version. Numerical results for a mesh containing 252K vertices using 8-nina and 8-pf processors running on the MecaGRID using GLOBUS and heterogeneous partitions ...
∗ INRIA, 2004 Route des Lucioles, BP. 93, 06902 Sophia-Antipolis, France
† Title changed, added results for heterogeneous partitioning, added detailed description of the parallelization implementation
inria-00069877, version 1, 19 May 2006
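The load-balancing idea described in the abstract (faster processors receive proportionally larger partitions so that all finish together) can be illustrated with a minimal sketch. This is not the AERO code's partitioner; the function names `partition_sizes` and `estimated_times`, and the 2:1 speed ratio used in the example, are assumptions made purely for illustration, since the abstract does not state the relative speeds of the nina and pf processors.

```python
def partition_sizes(total_vertices, speeds):
    """Split total_vertices among processors in proportion to their speeds.

    Hypothetical helper: `speeds` holds one positive relative-speed value
    per processor. Returns one integer partition size per processor.
    """
    if total_vertices < 0 or not speeds or any(s <= 0 for s in speeds):
        raise ValueError("need a non-negative vertex count and positive speeds")

    total_speed = sum(speeds)
    # Ideal (possibly fractional) share for each processor.
    ideal = [total_vertices * s / total_speed for s in speeds]
    # Round down, then hand out the leftover vertices one at a time
    # to the processors with the largest fractional remainder.
    sizes = [int(x) for x in ideal]
    leftover = total_vertices - sum(sizes)
    order = sorted(range(len(speeds)),
                   key=lambda i: ideal[i] - sizes[i],
                   reverse=True)
    for i in order[:leftover]:
        sizes[i] += 1
    return sizes


def estimated_times(sizes, speeds):
    """Relative time per processor, assuming work scales with vertex count."""
    return [n / s for n, s in zip(sizes, speeds)]


if __name__ == "__main__":
    # Assumed example: 8 processors twice as fast as the other 8.
    speeds = [2.0] * 8 + [1.0] * 8
    sizes = partition_sizes(252_000, speeds)
    print(sizes)
    print(estimated_times(sizes, speeds))
```

With equal estimated times across processors, no processor idles waiting for another. In a code with dynamic memory allocation, each process would then allocate arrays sized to its own partition rather than to a fixed compile-time maximum.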
Similar resources
A case study of the partitioning patterns for domain decomposition method on VPP700E
The most common parallelization strategy for many Computational Mechanics codes (typified by Computational Fluid Dynamics (CFD) applications) that use structured grids involves a one-directional partition based upon slabs of grids. For parallelised versions of CFD codes to scale well we must employ two (or more) dimensional partitions. However, FORTRAN code implementations by multi-directional par...
Optimization of a Legacy Open Source CFD Code for the New High Performance Computing Architectures
Legacy computational fluid dynamics (CFD) software plays a crucial role in today’s simulation-based engineering due to the confidence level established through validation of these codes over time. Fortran, one of the oldest programming languages around, has gained an important role in high performance computing (HPC) environments as the complexity of the problems tackled increased with their siz...
Performance optimizations for scalable CFD applications on hybrid CPU+MIC heterogeneous computing system with millions of cores
For computational fluid dynamics (CFD) applications with a large number of grid points/cells, parallel computing is a common and efficient strategy to reduce the computational time. How to achieve the best performance on modern supercomputer systems, especially those with heterogeneous computing resources such as hybrid CPU+GPU or CPU + Intel Xeon Phi (MIC) co-processors, is still a great challenge...
Parallel Computation of Turbulent Combustion Processes on Individually Discretized Domains
The aim of this work is the parallel solution of large 3D-CFD problems concerning the numerical description of turbulent combustion and pollutant formation processes in coal-fired utility boilers. The in-house developed multi-domain CFD code AIOLOS for quasi-stationary, weakly compressible, turbulent reacting flows is used. The code is based on a conservative finite-volume formulation with a co...
A Fast Double Precision CFD Code using CUDA
We describe a second-order double precision finite volume Boussinesq code implemented using the CUDA platform. We perform detailed validation of the code on a variety of Rayleigh-Bénard convection problems and show second-order convergence. We obtain matching results with a Fortran code running on a high-end eight-core CPU. The CUDA-accelerated code achieves approximately an eightfold speedup ...
Publication date: 2005